PHNLO 10450 

1 05.07.2001 

Linking in parametric encoding 



The invention relates to a linking unit according to the preamble of claim 1. 
The linking unit serves for generating linking information indicating components of 
consecutive (typically overlapping) extended segments sp and sc which may be linked 
together in order to form a sinusoidal track, the segments sp and sc approximating 
consecutive segments of a sinusoidal audio or speech signal s. 

The invention further relates to a parametric encoder according to the 
preamble of claim 8 and a method for generating said linking information according to the 
preamble of claim 9. 



In the prior art, two substantially different approaches are known for 
providing the linking information L used to establish sinusoidal tracks over consecutive 
segments. According to a first approach, described in WO 00/79519 (PHN 017502 
EP.P), partial signals of an original audio or speech signal are reconstructed based on 
sinusoidal input data including amplitude, frequency and phase information from a previous 
and a current segment. These reconstructed partial signals are compared with the original 
audio or speech signal. The weighted mean-squared error signal was proposed as a criterion 
to select relevant links, i.e. to generate the linking information L. 

This first approach takes not only amplitude and frequency information 
into account for optimally linking consecutive segments but also considers phase information 
of the components of the previous and the current segment. However, the drawbacks of this 
first approach are its computational burden and the fact that the original signal is required to 
generate the linking information. 

According to a second approach known in the art, the linking information is 
generated by considering only the amplitude and the frequency information from the 
sinusoidal code data of the current and the previous segment, but not their phase 
information. Said second approach is now described by referring to Fig. 5. 

Fig. 5 shows a linking unit 500 as described in the preamble of claim 1. It 
comprises a calculating unit 520 for generating a similarity matrix S(m,n) in response to 
received sinusoidal code data Dp', Dc'. Said sinusoidal code data include information about 
the amplitudes and the frequencies of M components xm with m = 1 ... M of the extended 
previous segment sp and of N components yn with n = 1 ... N of the extended current segment 
sc. The similarity matrix S(m,n) represents the similarity between the m'th component xm of 
said extended previous segment sp and the n'th component yn of said extended current 
segment sc for m = 1 ... M and n = 1 ... N. Said similarity matrix S(m,n) is input into an 
evaluating unit 540 which evaluates said similarity matrix in order to generate said linking 
information L by selecting those pairs of components (m,n) the similarity of which is maximal. 

Consequently, the linking information L indicates those pairs of components 
of consecutive extended segments which may be linked together when restoring the audio or 
speech signal s after storage or transmission such that transitions between consecutive 
segments or components thereof are as smooth as possible. Smooth transitions lead to an 
improved quality of the restored signal. 

Hereinafter, linked components continuing over consecutive segments are 
referred to as a sinusoidal track even if the separate components include slight variations, e.g. 
amplitude or frequency variations. 

An advanced application of that second approach has been described by B. 
Edler, H. Purnhagen, and C. Ferekidis in "ASAC - Analysis/synthesis codec for very low bit 
rates", Preprint 4179 (F-6), 100th AES Convention, Copenhagen, 11-14 May, 1996. 
In that article the authors propose a combination of relative distances in 
frequency and amplitude as an additional criterion for generating the linking information. 
Expressed in other words, the linking information indicates if and which components of the 
previous and the current segment are considered to be local estimates belonging to the same 
sinusoidal track. 

Advantageously, according to the second approach the generation of the 
linking information is done without considering the original audio or speech signal; however, 
since generation of the linking information according to the second approach is based on 
estimated sinusoidal code data only, the generated linking information may be wrong and 
incorrect tracks may be provided. 

Starting from said second approach, it is the object of the present invention to 
further develop a known linking unit, a parametric encoder and a method for generating 
linking information such that the selection of components of consecutive segments suitable 
for being linked together is improved, resulting in a definition of a correct sinusoidal track. 

That object is solved by the subject matter of claim 1. According to the 
characterising portion of claim 1, enlarged sinusoidal code data shall be provided comprising 
not only amplitude and frequency information but also information about the phase of at least 
some of the M components xm and at least some of the N components yn. Further, the calculating 
unit of the linking unit is adapted to calculate the similarity matrix S(m,n) by additionally 
considering the phase consistency between the m'th component xm of the extended previous 
segment sp and the n'th component yn of the extended current segment sc. 

Advantageously, the proposed linking unit uses only estimated sinusoidal 
code data, including phase information, for generating the linking information. By additionally 
considering the phase information, a more accurate determination of the similarity matrix and 
thus a more reliable determination of the linking information, in comparison to the second 
approach known in the art, is possible without considering the original audio or speech signal 
s. 

According to a first embodiment the calculating unit comprises a first pattern 
generating unit for generating said M complex components xm(t) of the extended previous 
segment sp and a second pattern generating unit for generating said N complex components 
yn(t) of the extended current segment sc. The explicit calculation of these complex and 
time-dependent components is required according to the invention in order to be able to evaluate 
the phase consistency between each of said components of the previous and of the current 
segment. 

Advantageously, the calculating module is adapted to calculate the similarity 
matrix S(m,n) as a product of a first similarity matrix S1(m,n) representing the similarity in shape 
and a second similarity matrix S2(m,n) representing the similarity in amplitude between the 
components m and n. Further advantageous embodiments of the linking unit are the subject 
matter of dependent claims 4 to 7. 

The object of the invention is further solved by a parametric encoder 
according to claim 8 and a method for generating linking information according to claim 9. 
The advantages of the parametric encoder and of the method substantially correspond to the 
advantages mentioned above with reference to the linking unit. 

Five figures accompany the description, wherein 

Fig. 1 shows a linking unit according to the invention; 

Fig. 2 shows a more detailed illustration of a calculating unit of the linking 
unit according to Fig. 1; 

Fig. 3 illustrates the similarity of two components of two consecutive 
segments; 

Fig. 4 shows a parametric encoder according to the present invention; and 

Fig. 5 shows a linking unit known in the art. 



Before a preferred embodiment of the invention is described with reference 
to the figures, a preliminary remark is made to provide some background information 
about the sinusoidal modelling of signal segments in general. 

In sinusoidal modelling, the models are typically of the form (or can be 
rewritten as such) 

$$ \mathrm{seg}(t) = \sum_{k=1}^{K} \Re\{ u_k(t) \} \tag{0} $$

where seg is a segment approximating or modelling a segment of a sinusoidal signal s. In 
these models the segment seg is represented by an extension as given on the right-hand side 
of equation (0), wherein ℜ denotes the real part of a complex variable and uk are the K 
underlying sinusoidal or sinusoidal-like segment components of the segment seg. 

In particular, for a pure first sinusoidal model (extension), the segment's 
components are 

$$ u_k(t) = A_k e^{j(\omega_k t + \mu_k)} \tag{1} $$

with Ak, ωk and μk the (real-valued) amplitude, frequency and phase, respectively, and j = √(-1). 
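
As an illustration, the pure sinusoidal model can be sketched in a few lines of Python. The sketch assumes the usual complex-exponential form A·exp(j(ωt + phase)) for each component, a discrete sample index t, and a segment formed as the sum of the real parts of its components. The function names and the example parameter values are hypothetical and are not taken from the description.

```python
import cmath

def sinusoid_component(amplitude, omega, phase, t):
    """Complex component A * exp(j * (omega * t + phase)) at sample index t."""
    return amplitude * cmath.exp(1j * (omega * t + phase))

def segment_value(components, t):
    """Sum of the real parts of all components at sample index t.

    `components` is a list of (amplitude, omega, phase) tuples.
    """
    return sum(sinusoid_component(a, w, p, t).real for a, w, p in components)

# Hypothetical example: a constant term (omega = 0) plus one sinusoid
# whose phase is offset by a quarter cycle.
components = [(1.0, 0.0, 0.0), (0.5, 0.25, -cmath.pi / 2)]
samples = [segment_value(components, t) for t in range(4)]
```

Keeping the components complex, rather than storing only their real parts, is what later allows the phase of two components to be compared directly.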

According to a second model the components of the segment are defined as: 



(2) 

where Ak, ωk and μk are as in the pure sinusoidal model and an additional parameter ak 
appears; ak is a real parameter which captures amplitude changes within a segment. 

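A third, more elaborate known model based on polynomials is: 

$$ u_k(t) = \sum_{m=0}^{M} B_{k,m}\, t^m\, e^{j \omega_k t} \tag{3} $$

with real parameters $b_{k,m}$ and $\mu_{k,m}$, or equivalently complex amplitudes $B_{k,m} = b_{k,m} e^{j \mu_{k,m}}$. 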



Finally, according to a fourth model, the components of the segments are 
defined as: 

$$ u_k(t) = \sum_{m=0}^{M} C_{k,m}\, t^m \exp\Big( j \sum_{n=1}^{N} e_{k,n}\, t^n \Big) \tag{4} $$

with real parameters ek,n and complex parameters Ck,m. 

If two consecutive signal segments sp and sc (previous and current segment, 
respectively) are considered, then there is typically an overlap in their support. Hereinafter uk 
in the previous segment is denoted by xm (m = 1, ..., M) and uk in the current segment is 
denoted by yn (n = 1, ..., N). In order that profitable (in a coding sense) links are established, it 
seems reasonable to speak of a link between a component m from sp and a component n from 
sc only if xm(t) and yn(t) are similar within the overlap area. 

In the following, preferred embodiments of the invention will be described with 
reference to Figs. 1 to 4. 

Fig. 1 shows a linking unit 100 according to the present invention. It 
comprises a calculating unit 120 for generating a similarity matrix S(m,n) and an evaluating 
unit 140 for generating linking information L. The operation of the calculating unit 120 
substantially corresponds to the operation of the calculating unit 520 and the operation of the 
evaluating unit 140 substantially corresponds to the operation of the evaluating unit 540 
known in the art and described above by referring to Fig. 5. However, there are the following 
differences between the operation of the linking unit 100 according to the invention and the 
linking unit 500 known in the art. 

The calculating unit 120 receives not only sinusoidal code data in the form 
of amplitude and frequency data of the previous and the current segment but 
enlarged sinusoidal code data further comprising information about the phase of each of the 
M components xm of the previous segment sp and each of the N components yn of the current 
segment sc. 

Consequently, the calculating unit 120 is adapted to calculate the similarity 
matrix S(m,n) not only by considering the amplitude and frequency data but additionally by 
considering the phase consistency between the m'th component xm of the extended previous 
segment sp and the n'th component yn of the extended current segment sc for m = 1 ... M and 
n = 1 ... N. The evaluating unit 140 receives and evaluates the similarity matrix S(m,n) output 
from said calculating unit 120 in order to generate said linking information L by selecting 
those pairs of components (m,n) the similarity of which is maximal. 

Fig. 2 shows a detailed illustration of the calculating unit 120 according to the 
invention. It can be seen that the calculating unit 120 comprises a first pattern generating unit 
122 for generating said M components xm(t) with m = 1 ... M of the extended previous 
segment sp in response to the previous segment's enlarged sinusoidal code data (Dp). 
Further, the calculating unit 120 comprises a second pattern generating unit 124 for 
generating said N components yn(t) with n = 1 ... N of the extended current segment sc in 
response to the current segment's enlarged sinusoidal code data (Dc). Finally, the calculating 
unit 120 comprises a calculating module 126 for calculating the similarity matrix S(m,n) on 
the basis of said received M components xm(t) and of said received N components yn(t) 
according to a predefined similarity measure. Examples of the similarity measure are given 
below. 

The components xm(t) and yn(t) are explicitly generated and input to the 
calculating module 126 in order to determine the phase consistency between two components 
m and n and to use that phase consistency information for calculating the similarity matrix. 

In the following, two embodiments of the invention will be described for 
carrying out the calculation of the similarity matrix S(m,n). Both embodiments have in 
common that the similarity matrix is preferably, but not necessarily, calculated by multiplying 
a first similarity matrix S1(m,n) representing the similarity in shape between the two 
components m and n with a second similarity matrix S2(m,n) representing the similarity in 
amplitude between said components m and n. The similarity matrix is then calculated 
according to: 

$$ S(m,n) = S_1(m,n)\, S_2(m,n) \tag{5} $$



S(m,n) = 0 means that there is no link and the larger S(m,n) is, the more likely 
it is that this can be exploited profitably as a link in a sinusoidal coding scheme. 

The first embodiment for calculating the similarity matrix S is based on the 
consideration of the similarity of the previous and the current segment within a complete 
overlapping area. The aim of said first embodiment is to identify components of the previous 
and the current segment which are similar. This can be done by a correlation method. Thus, 
according to the first embodiment a correlation coefficient ρm,n is defined by 

$$ \rho_{m,n} = \frac{\sum_t w(t)\, x_m(t)\, y_n^*(t)}{\sqrt{E_{x_m} E_{y_n}}} \tag{6} $$



where xm (m = 1, ..., M) represents the set of components of the previous segment sp and yn 
(n = 1, ..., N) represents the set of components of the current segment sc. Further, w(t) 
represents a window function and Exm represents the energy in the component xm according to: 

$$ E_{x_m} = \sum_t w(t)\, x_m(t)\, x_m^*(t) \tag{7a} $$

Analogously, Eyn represents the energy in the component yn according to 

$$ E_{y_n} = \sum_t w(t)\, y_n(t)\, y_n^*(t) \tag{7b} $$
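
The windowed energies and the correlation coefficient can be sketched as follows. This is a minimal sketch under stated assumptions: the correlation is taken to be the conventional windowed inner product of one component with the complex conjugate of the other, normalised by the square root of the product of both energies, and the sums run over the samples of the overlap region. The function names, the rectangular window and the example signals are hypothetical.

```python
import cmath
import math

def energy(x, w):
    """Windowed energy of a complex sequence: sum of w(t) * |x(t)|^2."""
    return sum(wt * abs(xt) ** 2 for xt, wt in zip(x, w))

def correlation(x, y, w):
    """Windowed cross-correlation of x and y, normalised by both energies."""
    ex, ey = energy(x, w), energy(y, w)
    if ex == 0.0 or ey == 0.0:
        return 0j  # no meaningful correlation for an all-zero component
    cross = sum(wt * xt * yt.conjugate() for xt, yt, wt in zip(x, y, w))
    return cross / math.sqrt(ex * ey)

# Overlap region of 16 samples with a rectangular window (an assumption).
w = [1.0] * 16
x = [cmath.exp(1j * 0.3 * t) for t in range(16)]
same_shape = [2.0 * v for v in x]   # same frequency and phase, larger amplitude
opposite_phase = [-v for v in x]    # same frequency, phase shifted by pi

rho_same = correlation(x, same_shape, w)          # close to 1
rho_opposite = correlation(x, opposite_phase, w)  # close to -1
```

With this normalisation the coefficient is insensitive to a pure amplitude scaling, so it measures similarity in shape (frequency and phase) only; a phase mismatch moves it away from 1 even when the frequencies agree.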



Consequently, ρm,n is a complex number which, for a link, should be close to 
1. Therefore, the first similarity matrix S1(m,n) is built as a (partial) similarity measure by: 

$$ S_1(m,n) = \begin{cases} 1 - |\rho_{m,n} - 1| / D_1, & \text{if } |\rho_{m,n} - 1| < D_1 \\ 0, & \text{elsewhere} \end{cases} \tag{8} $$

with 0 < D1 < 1. 

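Additionally, the equivalence in amplitude (or, more particularly, in energy) can 
be taken into account by considering: 

$$ R_{m,n} = \min\left\{ \frac{E_{x_m}}{E_{y_n}},\ \frac{E_{y_n}}{E_{x_m}} \right\} \tag{9} $$

Again, for a link, Rm,n should be a value close to 1 (in contrast to ρm,n, Rm,n is 
real-valued), and S2(m,n) can act as similarity measure, defined by 

$$ S_2(m,n) = \begin{cases} 1 - |R_{m,n} - 1| / D_2, & \text{if } |R_{m,n} - 1| < D_2 \\ 0, & \text{elsewhere} \end{cases} \tag{10} $$

with 0 < D2 < 1. 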
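
The two partial measures and their product can be sketched as follows. The sketch assumes that both measures have the same piecewise-linear form, falling from 1 to 0 as the distance from 1 grows to the tolerance, and that the energy ratio is the smaller of the two possible quotients. The function names and the default tolerances of 0.5 are hypothetical.

```python
def shape_similarity(rho, d1):
    """1 - |rho - 1| / d1 when |rho - 1| < d1, otherwise 0 (rho may be complex)."""
    dist = abs(rho - 1)
    return 1.0 - dist / d1 if dist < d1 else 0.0

def energy_ratio(ex, ey):
    """Smaller of ex/ey and ey/ex; assumes both energies are positive."""
    return min(ex / ey, ey / ex)

def amplitude_similarity(r, d2):
    """1 - |r - 1| / d2 when |r - 1| < d2, otherwise 0."""
    dist = abs(r - 1)
    return 1.0 - dist / d2 if dist < d2 else 0.0

def similarity(rho, ex, ey, d1=0.5, d2=0.5):
    """Overall entry S(m,n) as the product of shape and amplitude similarity."""
    return shape_similarity(rho, d1) * amplitude_similarity(energy_ratio(ex, ey), d2)

perfect = similarity(1 + 0j, 2.0, 2.0)      # identical shape and energy
antiphase = similarity(-1 + 0j, 2.0, 2.0)   # same energy, opposite phase
```

Because the overall value is a product, a pair is rejected (value 0) as soon as either the shape or the amplitude falls outside its tolerance.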

If the previous segment sp is represented by M components and if the current 
segment sc is represented by N components, the first matrix S1 and the second matrix S2 as 
well as the overall similarity matrix S are M x N matrices. The entries of said matrix S 
establish if there exist links and, if so, which are the most profitable ones. The most 
profitable ones are the ones the similarity values of which are maximal. This evaluation of 
the similarity matrix S(m,n) is done in the evaluating unit 140. 
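
One possible way to carry out this evaluation is sketched below. The description only states that the pairs with maximal similarity are selected; the greedy one-to-one strategy used here, in which each component takes part in at most one link, is an assumption made for illustration, as are the function name and the example matrix. Indices are 0-based in the code.

```python
def select_links(S):
    """Greedily pick (m, n) pairs in order of decreasing similarity.

    S is an M x N matrix given as a list of rows. Each previous-segment
    component m and each current-segment component n is used at most once;
    entries equal to 0 never produce a link.
    """
    candidates = sorted(
        ((s, m, n) for m, row in enumerate(S) for n, s in enumerate(row) if s > 0),
        key=lambda c: -c[0],
    )
    used_m, used_n, links = set(), set(), []
    for s, m, n in candidates:
        if m not in used_m and n not in used_n:
            links.append((m, n))
            used_m.add(m)
            used_n.add(n)
    return links

# Hypothetical 2 x 3 similarity matrix (M = 2, N = 3).
S = [[0.9, 0.2, 0.0],
     [0.3, 0.0, 0.7]]
links = select_links(S)
```

A component that ends up in no pair is simply not continued: on the previous side its track ends, and on the current side it may become the start of a new track.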

The second embodiment of the invention for calculating the similarity matrix S 
represents a simplification of the first embodiment. More specifically, not the whole 
overlapping region between the consecutive segments but only the midpoint of said region is 
considered. At this point, hereinafter referred to as sample t0, it should hold that 

$$ x_m(t_0) \approx y_n(t_0) \tag{11} $$



In that second embodiment it is desired that in the neighbourhood of t0 the 
components are matched as well. This is realised if the progression (the stride) in the 
components is (nearly) the same. This is preferably evaluated by the ratio of the components 
of the two consecutive segments sp and sc according to 

$$ \frac{x_m(t_0+1)}{y_n(t_0+1)} \approx \frac{x_m(t_0)}{y_n(t_0)} \tag{12} $$



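In order to select links the first (partial) similarity matrix is now defined as: 

$$ S_1(m,n) = \begin{cases} 1 - \left| \dfrac{x_m(t_0)}{y_n(t_0)} - 1 \right| / D_3, & \text{if } \left| \dfrac{x_m(t_0)}{y_n(t_0)} - 1 \right| < D_3 \\ 0, & \text{elsewhere} \end{cases} \tag{13} $$

with 0 < D3 < 1. 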

Here, the amplitude similarity is involved in a relative way. This agrees with 
psycho-acoustic relevance and distance criteria. 

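The second partial similarity matrix S2 is defined as: 

$$ S_2(m,n) = \begin{cases} 1 - \left| \dfrac{x_m(t_0+1)\, y_n(t_0)}{x_m(t_0)\, y_n(t_0+1)} - 1 \right| / D_4, & \text{if } \left| \dfrac{x_m(t_0+1)\, y_n(t_0)}{x_m(t_0)\, y_n(t_0+1)} - 1 \right| < D_4 \\ 0, & \text{elsewhere} \end{cases} \tag{14} $$

with 0 < D4 < 1. 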

The second embodiment for calculating the overall similarity matrix S differs 
from the first embodiment in that the components xm and yn need only be generated at 
specific instants, namely t0 and t0+1. 
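
A minimal sketch of this simplified calculation is given below. It rests on assumptions that go beyond the text: the match at t0 is scored from the relative difference between the two complex values, the stride match is scored from the ratio of their one-sample progressions, and both scores use the same piecewise-linear form with tolerances D3 and D4. The function names and the default tolerances are hypothetical.

```python
import cmath

def tolerance_score(z, d):
    """1 - |z - 1| / d when |z - 1| < d, otherwise 0 (z may be complex)."""
    dist = abs(z - 1)
    return 1.0 - dist / d if dist < d else 0.0

def midpoint_similarity(x0, x1, y0, y1, d3=0.5, d4=0.5):
    """Similarity from two samples only.

    x0, x1: previous-segment component at t0 and t0 + 1.
    y0, y1: current-segment component at t0 and t0 + 1.
    Assumes none of the four values is zero.
    """
    match_at_t0 = tolerance_score(x0 / y0, d3)
    stride_match = tolerance_score((x1 * y0) / (x0 * y1), d4)
    return match_at_t0 * stride_match

# Hypothetical component with frequency 0.3 rad/sample, sampled at t0 and t0 + 1.
x0, x1 = 1 + 0j, cmath.exp(0.3j)
identical = midpoint_similarity(x0, x1, x0, x1)
antiphase = midpoint_similarity(x0, x1, -x0, -x1)
other_freq = midpoint_similarity(x0, x1, x0, cmath.exp(1.0j))
```

Only four complex values per pair are needed, which is the source of the saving compared with a correlation over the whole overlap region.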

Fig. 3 illustrates the operation of the linking unit of the present invention. It is 
shown that a component xm(t) of a previous segment sp at least partially overlaps with a 
component yn(t) of a consecutive current segment sc in an overlap region OR. The calculating 
unit 120 and in particular the calculating module 126 are adapted to analyse the similarity 
between these two components within the overlap region. If the two components are identical 
at least within said overlap region, as shown in Fig. 3, the corresponding entry in the similarity 
matrix S(m,n) would be set to one or at least close to one. The amplitude, frequency and 
phase similarity would be recognised and evaluated by the evaluating unit 140 with the result 
that the linking information L generated by said evaluating unit 140 in Fig. 1 would indicate 
that these two components are local estimates belonging to the same sinusoidal track. 

Fig. 4 shows a parametric encoder 400 according to the present invention. Said 
encoder serves for encoding an audio and/or speech signal s into a data stream ds including 
sinusoidal code data and linking information. The encoder 400 comprises a segmentation unit 
410 for segmenting said signal s into at least a previous segment sp' and a consecutive 
current segment sc'. The encoder 400 further comprises a sinusoidal estimating unit 420 for 
generating said sinusoidal code data in the form of frequency, amplitude and phase data of M 
components xm with m = 1 ... M of an extended previous segment sp approximating said 
segment sp' and of N components yn with n = 1 ... N of an extended current segment sc 
approximating said segment sc'. Said sinusoidal code data output from said sinusoidal 
estimating unit 420 is input to the linking unit 100 as described above by referring to Fig. 1 
for generating the linking information L. Said linking information is input into an arranging 
unit 430 for generating the data stream by appropriately arranging or mixing, e.g. 
multiplexing, the sinusoidal code data output from said sinusoidal estimating unit 420 with 
said linking information. The arranging unit 430 is preferably embodied as a multiplexer. 

For real audio signals it has been noted that taking phase information into account 
improves the quality of the coded material. However, in the encoder 400 the phase 
information is used only if a continuation of a track is searched. If a frequency 
from the data of the previous frame does not have a backward connection (i.e., it is not yet a 
track but may, after linking with the current frame data, become the start of a track) then the 
phase information is not used; instead, the previous linking procedures based on frequency 
and amplitude data only are relied on. The reason for this is that at the start of the track the phase is 
usually not well-defined. This means that the linking information of the previous segment sp 
is input to the calculating module 126 in Fig. 2 for steering purposes. 
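
This steering can be sketched as a per-component choice between two similarity rows. The sketch assumes that a phase-aware row and a row based on frequency and amplitude only have both been computed for every previous-segment component; the function names and the example values are hypothetical.

```python
def steered_row(has_backward_link, phase_aware_row, phase_free_row):
    """Row m of the similarity matrix for one previous-segment component.

    The phase-aware row is used only if the component already continues a
    track, i.e. it was linked to the segment before the previous one.
    """
    return list(phase_aware_row if has_backward_link else phase_free_row)

def steered_matrix(backward_links, phase_aware, phase_free):
    """Assemble the M x N similarity matrix row by row."""
    return [
        steered_row(linked, aware, free)
        for linked, aware, free in zip(backward_links, phase_aware, phase_free)
    ]

# Component 0 continues a track; component 1 is a potential start of a track.
S = steered_matrix(
    [True, False],
    [[0.9, 0.0], [0.0, 0.1]],   # phase-aware similarities
    [[0.8, 0.1], [0.2, 0.7]],   # frequency and amplitude only
)
```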

Instead of looking at (relative) differences between the complex values xm and yn, 
the real and imaginary parts, or the amplitudes and phases, can also be considered and used 
to construct the similarity criterion. This has the advantage that, instead of the two parameters 
that control the similarity measure given above, one or more parameters per considered 
variable are obtained. Expressed in real parameters instead of complex ones, one therefore 
typically ends up with twice as many parameters. For example, splitting the complex signals into 
amplitudes and phases has the interesting property that the similarity measure 
for the phases can more easily be made frequency-dependent. 
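
A sketch of such a split criterion follows. Everything specific in it is an assumption made for illustration: the amplitude score is based on the ratio of the smaller to the larger magnitude, the phase score on the wrapped phase difference, and the phase tolerance grows linearly with frequency. The function names, the tolerance values and the linear tolerance rule are hypothetical.

```python
import cmath

def wrapped_phase_difference(a, b):
    """Phase of a relative to b, in the range [-pi, pi]."""
    return cmath.phase(a * b.conjugate())

def default_phase_tolerance(omega):
    """Hypothetical rule: allow a larger phase error at higher frequencies."""
    return 0.5 + 0.5 * omega

def split_similarity(x, y, omega, amp_tol=0.5, phase_tol=default_phase_tolerance):
    """Similarity of two nonzero complex samples from separate amplitude and phase scores."""
    amp_ratio = min(abs(x), abs(y)) / max(abs(x), abs(y))
    amp_score = max(0.0, 1.0 - (1.0 - amp_ratio) / amp_tol)
    phase_error = abs(wrapped_phase_difference(x, y))
    phase_score = max(0.0, 1.0 - phase_error / phase_tol(omega))
    return amp_score * phase_score

same = split_similarity(1 + 0j, 1 + 0j, 0.3)
antiphase = split_similarity(1 + 0j, -1 + 0j, 0.3)
```

Separating the two scores gives one tolerance per variable, which is what makes it straightforward to let only the phase tolerance depend on frequency.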

It should be noted that the above-mentioned embodiments illustrate rather than 
limit the invention, and that those skilled in the art will be able to design many alternative 
embodiments without departing from the scope of the appended claims. In the claims, any 
reference signs placed between parentheses shall not be construed as limiting the claim. The 
word 'comprising' does not exclude the presence of other elements or steps than those listed 
in a claim. The invention can be implemented by means of hardware comprising several 
distinct elements, and by means of a suitably programmed computer. In a device claim 
enumerating several means, several of these means can be embodied by one and the same 
item of hardware. The mere fact that certain measures are recited in mutually different 
dependent claims does not indicate that a combination of these measures cannot be used to 
advantage. 



